Fragmenting XML Documents via Structural Constraints

Authors

Angela Bonifati

Alfredo Cuzzocrea

Bruno Zinno

Abstract

XML query processors suffer from main-memory limitations that prevent them from processing large XML documents. While content-based predicates can be used to project down parts of the documents, it may still be necessary to resize the obtained projections according to structural constraints. In this paper, we consider size, tree-width and tree-depth constraints to enable a structure-driven fragmentation of XML documents. Although a set of heuristics performing this kind of fragmentation can be easily devised, a key problem is determining the values of the structural constraints given as input to these heuristics, since the search space is, in general, prohibitively large. To alleviate the problem, we introduce special-purpose structure histograms that report the constraint values for the fragments of a given document. We then present a prediction algorithm that probes those histograms to output the expected number of fragments when fixed input values of the constraints are used. Furthermore, we study how to relax the fixed constraints by means of classical distributions. An experimental evaluation shows the effectiveness of our fragmentation methodology on some representative XML datasets.
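To make the three constraints concrete, the following is a minimal sketch of a greedy top-down fragmentation. It is not the heuristics or the prediction algorithm of the paper, which the abstract does not spell out. The function names `measure`, `fragment` and `size_histogram` are hypothetical, and the definitions used here (size as node count, depth as subtree height, width as the largest number of children of any node in the subtree) are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def measure(node):
    """Return (size, depth, width) of the subtree rooted at node.

    Assumed definitions: size = number of nodes, depth = height of the
    subtree, width = largest child count of any node in the subtree.
    """
    size, depth, width = 1, 0, len(node)
    for child in node:
        s, d, w = measure(child)
        size += s
        depth = max(depth, d + 1)
        width = max(width, w)
    return size, depth, width


def fragment(node, max_size, max_depth, max_width):
    """Greedy top-down split; returns the tags of the fragment roots.

    A subtree that satisfies all three constraints is kept whole.
    Otherwise the node alone becomes a stub fragment and each child
    subtree is fragmented recursively.
    """
    size, depth, width = measure(node)
    if size <= max_size and depth <= max_depth and width <= max_width:
        return [node.tag]
    roots = [node.tag]  # stub fragment holding only this node
    for child in node:
        roots.extend(fragment(child, max_size, max_depth, max_width))
    return roots


def size_histogram(root):
    """Count how many subtrees have each size (a toy structure histogram)."""
    return Counter(measure(n)[0] for n in root.iter())


doc = ET.fromstring("<a><b><c/><d/></b><e/></a>")
print(measure(doc))            # structural measures of the whole document
print(fragment(doc, 3, 2, 2))  # fragment roots under a size limit of 3
print(size_histogram(doc))     # subtree sizes and their frequencies
```

The sketch recomputes `measure` at every level for brevity; a single bottom-up pass that caches the three values per node would avoid the repeated work on large documents.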

Similar resources

Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity

Due to the increasing number of XML documents, effectively organizing these documents in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between them. Conventional clustering of text documents using a do...


Structural Similarity Evaluation Between XML Documents and DTDs

The automatic processing and management of XML-based data are increasingly popular research issues due to the increasingly abundant use of XML, especially on the Web. Nonetheless, several operations based on the structure of XML data have not yet received strong attention. Among these is the process of matching XML documents and XML grammars, useful in various applications such as documents classifi...


Distribution Design for XML documents

The web is often seen as the world's largest database, and XML is regarded as providing its data model. As XML data is naturally distributed across the web, it should be considered a distributed database and subject to distribution design. The main tasks of distribution design are fragmenting the underlying database schema and allocating the fragments to different sites. The aim of fragmentation...


QMatch - Using paths to match XML schemas

Integration of multiple heterogeneous data sources continues to be a critical problem for many application domains and a challenge for researchers world-wide. With the increasing popularity of the XML model and the proliferation of XML documents on-line, automated matching of XML documents and databases has become a critical issue. In this paper, we present a hybrid schema match algorithm, QMat...


Generating XML structure using examples and constraints

This paper presents a framework for automatically generating structural XML documents. The user provides a target DTD and an example of an XML document, called a Generate-XML-ByExample Document, or a GxBE document, for short. GxBE documents use a natural declarative syntax, which includes XPath expressions and the function count. Using GxBE documents, users can express important global and loca...



Publication date: 2006
